Compact Generalized Non-local Network

Neural Information Processing Systems

The non-local module is designed to capture long-range spatio-temporal dependencies in images and videos. Although it has shown excellent performance, it lacks a mechanism to model the interactions between positions across channels, which are of vital importance in recognizing fine-grained objects and actions. To address this limitation, we generalize the non-local module and take the correlations between the positions of any two channels into account. This extension uses a compact representation of multiple kernel functions, based on Taylor expansion, which gives the generalized non-local module a fast, low-complexity computation flow. Moreover, we implement our generalized non-local method within channel groups to ease the optimization. Experimental results show clear improvements and the practical applicability of the generalized non-local module on both fine-grained object recognition and video classification.
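The compact computation described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names are hypothetical, the kernel exp(2·gamma·a·b) is an assumed stand-in for the "multiple kernel functions" the abstract mentions, and the learned projections and any normalization are omitted. The inputs `theta`, `phi` and `g` are assumed to be already-projected feature maps of shape (positions, channels). The point is only that a truncated Taylor series lets each term collapse to a scalar times an element-wise power, so the (N·C) × (N·C) affinity between all (position, channel) pairs never has to be built.

```python
import math
import numpy as np

def naive_generalized_nonlocal(theta, phi, g, gamma=0.1):
    """Reference version: materialise the (N*C) x (N*C) kernel matrix explicitly."""
    t, p, v = theta.ravel(), phi.ravel(), g.ravel()
    kernel = np.exp(2.0 * gamma * np.outer(t, p))  # K[i, j] = exp(2*gamma*t_i*p_j)
    return (kernel @ v).reshape(theta.shape)

def compact_generalized_nonlocal(theta, phi, g, gamma=0.1, order=10, groups=1):
    """Truncated-Taylor version, applied per channel group; no big matrix is built."""
    n, c = theta.shape
    assert c % groups == 0, "channels must split evenly into groups"
    width = c // groups
    out = np.zeros_like(theta, dtype=float)
    for k in range(groups):
        cols = slice(k * width, (k + 1) * width)
        t, p, v = theta[:, cols], phi[:, cols], g[:, cols]
        for q in range(order + 1):
            coeff = (2.0 * gamma) ** q / math.factorial(q)
            # exp(2*gamma*t_i*p_j) ~= sum_q coeff * t_i**q * p_j**q, so each term
            # is the scalar sum(p**q * v) times the element-wise power t**q.
            out[:, cols] += coeff * np.sum(p ** q * v) * t ** q
    return out
```

With `groups=1` the compact version approximates the naive one over all channels; with `groups > 1` it matches the naive version applied to each channel group separately. The cost per group is linear in the number of (position, channel) entries times the Taylor order, instead of quadratic.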


Reviews: Compact Generalized Non-local Network

Neural Information Processing Systems

This paper proposes a novel network module that exploits global (non-local) correlations in the feature map to improve ConvNets. The authors focus on a weakness of the non-local (NL) module [31], namely that correlations across channels are not sufficiently taken into account, and formulate the compact generalized non-local (CGNL) module to remedy it by unifying the earlier NL and bilinear pooling [14] methods in a single formulation. CGNL is evaluated in thorough experiments on action and fine-grained classification tasks, showing promising performance competitive with the state of the art.

Positives: The paper is well organized and easy to follow. The generalized formulation (8, 9) that unifies bilinear pooling and the non-local module is theoretically sound.
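One way to see why bilinear pooling and the non-local module can share a formulation is sketched below. This is an illustration under simplifying assumptions, not a reproduction of the paper's equations (8, 9), which are not shown here: raw features `x` of shape (positions, channels) are used without learned projections, with a plain dot-product affinity, and the function names are hypothetical. The tensor of all pairwise products between (position, channel) entries contains both the N × N position affinity and the C × C bilinear statistic as partial sums.

```python
import numpy as np

def pairwise_products(x):
    """T[i, c, j, d] = x[i, c] * x[j, d] for every pair of (position, channel) entries."""
    return np.einsum("ic,jd->icjd", x, x)

def position_affinity(x):
    """N x N affinity between positions, as in a dot-product non-local module."""
    return x @ x.T

def bilinear_statistic(x):
    """C x C second-order statistic between channels, as in bilinear pooling."""
    return x.T @ x

def affinity_from_pairs(pairs):
    """Keep matching channels (c == d) and sum over them: recovers x @ x.T."""
    return np.einsum("icjc->ij", pairs)

def bilinear_from_pairs(pairs):
    """Keep matching positions (i == j) and sum over them: recovers x.T @ x."""
    return np.einsum("icid->cd", pairs)
```

The entries of `pairwise_products(x)` with differing positions and differing channels are the cross-channel, cross-position correlations that neither partial sum retains.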


Compact Generalized Non-local Network

Yue, Kaiyu, Sun, Ming, Yuan, Yuchen, Zhou, Feng, Ding, Errui, Xu, Fuxin

Neural Information Processing Systems
